Computationally Easy Outlier Detection via Projection Pursuit with Finitely Many Directions

Authors

  • Satyaki Mazumder
  • Robert Serfling
Abstract

Outlier detection methods are fundamental to all of data analysis. Ideally, they are robust, affine invariant, and computationally easy in any dimension. The powerful projection pursuit approach yields the “projection outlyingness”, which is affine invariant and highly robust and, unlike the Mahalanobis distance approach, does not impose ellipsoidal contours. However, it is highly computationally intensive, since it is obtained by taking the supremum of the univariate scaled deviation outlyingness over all projections of the data onto lines. Here we introduce several outlyingness functions based on a vector of scaled deviations taken over only finitely many directions, spread approximately uniformly over the unit hypersphere. A preliminary transformation of the data to a strong invariant coordinate system makes such vectors affine invariant. We establish useful foundational theory for finite vectors of scaled deviations on projections. Also, using artificial and real data sets, we compare our affine invariant outlyingness functions with the usual projection outlyingness and with robust Mahalanobis distance outlyingness.

AMS 2000 Subject Classification: Primary 62H99. Secondary 62G99.
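The idea of replacing the supremum over all directions by finitely many directions can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the median and MAD are assumed as the univariate location and scale, normalized Gaussian vectors are assumed as a stand-in for "approximately uniform" directions, and the maximum of the scaled deviation vector is only one possible outlyingness function. The preliminary transformation to a strong invariant coordinate system is omitted, so this sketch is not affine invariant.

```python
import numpy as np

def unit_directions(k, d, seed=0):
    """k directions on the unit hypersphere in R^d.
    Assumption: normalized Gaussian vectors stand in for an
    approximately uniform set of directions."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((k, d))
    return u / np.linalg.norm(u, axis=1, keepdims=True)

def scaled_deviations(X, U):
    """Vector of scaled deviations |u'x - med| / MAD for each row x of X
    and each direction u in U. Assumption: median and MAD are the
    location and scale used on each projection."""
    P = X @ U.T                                # projections, shape (n, k)
    med = np.median(P, axis=0)                 # location per direction
    mad = np.median(np.abs(P - med), axis=0)   # scale per direction
    return np.abs(P - med) / mad

def finite_projection_outlyingness(X, U):
    """Maximum scaled deviation over the finite direction set
    (one illustrative choice of outlyingness function)."""
    return scaled_deviations(X, U).max(axis=1)

# Artificial data: 200 standard normal points in R^3 with one planted outlier.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
X[0] = [8.0, 8.0, 8.0]

U = unit_directions(50, 3)
scores = finite_projection_outlyingness(X, U)
flagged = int(np.argmax(scores))   # index of the most outlying point
```

The cost is one matrix product plus a median per direction, so it grows linearly in the number of directions rather than requiring an optimization over the whole hypersphere.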

Similar resources

Outlier Detection in Multivariate Time Series via Projection Pursuit

This article uses Projection Pursuit methods to develop a procedure for detecting outliers in a multivariate time series. We show that testing for outliers in some projection directions could be more powerful than testing the multivariate series directly. The optimal directions for detecting outliers are found by numerical optimization of the kurtosis coefficient of the projected series. We pro...

Outlier Detection in Multivariate Time Series by Projection Pursuit

In this article we use projection pursuit methods to develop a procedure for detecting outliers in a multivariate time series. We show that testing for outliers in some projection directions can be more powerful than testing the multivariate series directly. The optimal directions for detecting outliers are found by numerical optimization of the kurtosis coefficient of the projected series. We ...

Approximate Document Outlier Detection Using Random Spectral Projection

Outlier detection is an important process for text document collections, but as the collection grows, the detection process becomes a computationally expensive task. Random projection has been shown to provide a good, fast approximation of sparse data, such as document vectors, for outlier detection. Random samples of the Fourier and cosine spectra have been shown to provide good approximations of sparse...

Nonparametric Estimation of Nonlinear Money Demand Cointegration Equation by Projection Pursuit Methods

The money demand equation continues to attract the attention of econometricians, with a new wrinkle provided by cointegration. We use projection pursuit (PP) regressions pioneered by Friedman and Stuetzle (1981) to suggest new estimates of partials of conditional expectations of the regressands with respect to the regressors and prove their consistency. Since the usual cointegration methodology involves...

Application of Recursive Least Squares to Efficient Blunder Detection in Linear Models

In many geodetic applications a large number of observations are being measured to estimate the unknown parameters. The unbiasedness property of the estimated parameters is only ensured if there is no bias (e.g. systematic effect) or falsifying observations, which are also known as outliers. One of the most important steps towards obtaining a coherent analysis for the parameter estimation is th...

Publication date: 2011